Beyond Similarity
The "80% problem" arises when basic semantic search handles simple queries well but fails at the edges. When you search on similarity alone, a vector database returns the numerically closest chunks. If those chunks are near-duplicates, however, the LLM receives repeated information, wasting limited context space and missing the broader picture.
Pillars of Advanced Retrieval
- Maximal Marginal Relevance (MMR): rather than selecting only the most similar items, MMR balances relevance against diversity to avoid redundancy. $$MMR = \text{argmax}_{d \in R \setminus S} [\lambda \cdot \text{sim}(d, q) - (1 - \lambda) \cdot \max_{s \in S} \text{sim}(d, s)]$$
- Self-Query: uses an LLM to translate natural language into structured metadata filters (e.g., filtering by "Lecture 3" or "source: PDF").
- Contextual Compression: shrinks retrieved documents by extracting only the "high-nutrient" snippets relevant to the query, saving tokens.
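The MMR formula above can be sketched as a small re-ranking loop. This is a toy illustration assuming similarity scores have already been computed; the function name, inputs, and data layout are invented for this example, not taken from any library.

```python
# Toy MMR re-ranking over precomputed similarity scores.
# query_sims[i] = sim(d_i, q); doc_sims[i][j] = sim(d_i, d_j).

def mmr_select(query_sims, doc_sims, k, lam=0.5):
    """Pick k documents, balancing relevance (lam) against diversity (1 - lam)."""
    candidates = list(range(len(query_sims)))
    selected = []
    while candidates and len(selected) < k:
        def mmr_score(i):
            # Redundancy = similarity to the closest already-selected document.
            redundancy = max((doc_sims[i][j] for j in selected), default=0.0)
            return lam * query_sims[i] - (1 - lam) * redundancy
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With lam=1.0 this degenerates to plain similarity ranking; lowering lam penalizes picking a document that closely resembles one already chosen, which is exactly how MMR avoids returning three copies of the same paragraph.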
The Redundancy Trap
Giving an LLM three versions of the same paragraph does not make it smarter; it only makes the prompt more expensive. Diversity is the key to building "high-nutrient" context.
Knowledge Check
You want your system to answer "What did the instructor say about probability in the third lecture?" specifically. Which tool allows the LLM to automatically apply a filter for { "source": "lecture3.pdf" }?

Challenge: The Token Limit Dilemma
Apply advanced retrieval strategies to solve a real-world constraint.
You are building a RAG system for a legal firm. The documents retrieved are 50 pages long, but only 2 sentences per page are actually relevant to the user's specific query. The standard "Stuff" chain is throwing an OutOfTokens error because the context window is overflowing with irrelevant text.
Step 1
Identify the core problem and select the appropriate advanced retrieval tool to solve it without losing specific nuances.
Problem: The context window limit is being exceeded by "low-nutrient" text surrounding the relevant facts.
Tool Selection: ContextualCompressionRetriever

Step 2
What specific component must you use in conjunction with this retriever to "squeeze" the documents?
Solution: Use an LLMChainExtractor as the base for your compressor. This will process the retrieved documents and extract only the snippets relevant to the query, passing a much smaller, highly concentrated context to the final prompt.
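To make the "squeeze" concrete without depending on an LLM, here is a minimal stand-in for what an extractor does: keep only sentences that overlap with the query and discard the rest. This keyword-overlap heuristic is an illustrative assumption, far cruder than LLMChainExtractor's LLM-based extraction; every name in it is invented for this sketch.

```python
import re

def compress_docs(docs, query, min_overlap=1):
    """Keep only sentences sharing at least min_overlap words with the query.

    A crude, keyword-based stand-in for LLM-driven contextual compression:
    the retrieved pages shrink to the few sentences that mention query terms.
    """
    query_words = set(re.findall(r"\w+", query.lower()))
    kept = []
    for doc in docs:
        # Naive sentence split on terminal punctuation followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", doc):
            words = set(re.findall(r"\w+", sentence.lower()))
            if len(words & query_words) >= min_overlap:
                kept.append(sentence.strip())
    return kept
```

In the legal-firm scenario, 50 pages of retrieved text would collapse to the handful of sentences that actually mention the query's terms, so the final prompt fits comfortably inside the context window.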